
Conversation


@inikep inikep commented Dec 3, 2025

Issue: rows_in_buffer could exceed the number of rows requested by a SELECT ... LIMIT query. This could lead to unnecessary memory usage and inefficient fetching.

Fix: Adjust rows_in_buffer to respect the SELECT query's LIMIT. Now, rows_in_buffer is capped at the query limit, aligning buffer allocation with the expected number of rows.

…r` to query `LIMIT`

```cpp
  ulonglong select_limit =
      std::max<ulonglong>(q_block->select_limit->val_uint(), 2);
  rows_in_buffer = std::min(rows_in_buffer, select_limit);
}
```

@dinodork dinodork commented Dec 4, 2025


The solution works, and I notice a speedup of about a factor of three. I don't see any serious problems with it. It's not ideal that the executor peeks into the AST, however. My take on the coding pattern is that this type of information should be supplied by the iterator. This is probably something they would pick on if we proposed the patch upstream.

Before the regression, set_record_buffer() would cap the number of rows in the buffer at max_rows. So a more conservative approach would be to reintroduce that cap instead of peeking at the LIMIT, I guess?

What do you think?

